Overview

Dataset statistics

Number of variables: 26
Number of observations: 226763
Missing cells: 1652364
Missing cells (%): 28.0%
Duplicate rows: 655
Duplicate rows (%): 0.3%
Total size in memory: 45.0 MiB
Average record size in memory: 208.0 B
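
The percentages in this block follow from the raw counts. The sketch below re-derives them as a sanity check; it assumes that "missing cells (%)" is taken over the full variables × observations grid and that 1 MiB is 1024² bytes, neither of which the report states explicitly.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 26
n_observations = 226_763
missing_cells = 1_652_364
duplicate_rows = 655
total_memory_mib = 45.0  # already rounded in the report

# Share of missing cells over the full variables x observations grid (assumption).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Share of rows that are exact duplicates.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average bytes per row, assuming 1 MiB = 1024**2 bytes.
avg_record_bytes = total_memory_mib * 1024**2 / n_observations

print(f"missing: {missing_pct:.1f}%")
print(f"duplicates: {duplicate_pct:.1f}%")
print(f"bytes per record: {avg_record_bytes:.1f}")
```

Because the total memory figure is itself rounded to one decimal, the recomputed record size only approximates the reported 208.0 B.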

Variable types

DateTime: 2
Numeric: 2
Categorical: 19
Text: 3

Alerts

Dataset has 655 (0.3%) duplicate rows [Duplicates]
GARGANTA is highly overall correlated with DOR_ABD and 2 other fields [High correlation]
DISPNEIA is highly overall correlated with DESC_RESP [High correlation]
DESC_RESP is highly overall correlated with DISPNEIA and 1 other fields [High correlation]
SATURACAO is highly overall correlated with DESC_RESP [High correlation]
DIARREIA is highly overall correlated with VOMITO [High correlation]
VOMITO is highly overall correlated with DIARREIA [High correlation]
DOR_ABD is highly overall correlated with GARGANTA and 3 other fields [High correlation]
FADIGA is highly overall correlated with DOR_ABD [High correlation]
PERD_OLFT is highly overall correlated with GARGANTA and 2 other fields [High correlation]
PERD_PALA is highly overall correlated with GARGANTA and 2 other fields [High correlation]
ESTRANG is highly imbalanced (93.6%) [Imbalance]
CS_ZONA is highly imbalanced (73.3%) [Imbalance]
TOSSE is highly imbalanced (56.6%) [Imbalance]
DIARREIA is highly imbalanced (63.1%) [Imbalance]
VOMITO is highly imbalanced (54.4%) [Imbalance]
DOR_ABD is highly imbalanced (63.8%) [Imbalance]
PERD_OLFT is highly imbalanced (76.2%) [Imbalance]
PERD_PALA is highly imbalanced (76.5%) [Imbalance]
ESTRANG has 18945 (8.4%) missing values [Missing]
CS_ZONA has 19427 (8.6%) missing values [Missing]
OUT_ANIM has 226032 (99.7%) missing values [Missing]
FEBRE has 32100 (14.2%) missing values [Missing]
TOSSE has 20150 (8.9%) missing values [Missing]
GARGANTA has 63540 (28.0%) missing values [Missing]
DISPNEIA has 29836 (13.2%) missing values [Missing]
DESC_RESP has 36846 (16.2%) missing values [Missing]
SATURACAO has 41637 (18.4%) missing values [Missing]
DIARREIA has 64895 (28.6%) missing values [Missing]
VOMITO has 63050 (27.8%) missing values [Missing]
DOR_ABD has 66861 (29.5%) missing values [Missing]
FADIGA has 62595 (27.6%) missing values [Missing]
PERD_OLFT has 68822 (30.3%) missing values [Missing]
PERD_PALA has 69072 (30.5%) missing values [Missing]
OUTRO_SIN has 67384 (29.7%) missing values [Missing]
OUTRO_DES has 163149 (71.9%) missing values [Missing]
VACINA has 106570 (47.0%) missing values [Missing]
DOSE_2REF has 189217 (83.4%) missing values [Missing]
CLASSI_FIN has 17353 (7.7%) missing values [Missing]
CLASSI_OUT has 224773 (99.1%) missing values [Missing]
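
The report does not say how the imbalance percentage in the alerts is computed. One reading that fits the figures is 1 minus the Shannon entropy of the non-missing class frequencies, normalized by the logarithm of the number of classes. The sketch below implements that assumption; the function name `imbalance_score` is hypothetical, and the counts are the non-missing value counts listed in the ESTRANG and CS_ZONA sections.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): H is the Shannon entropy of the class
    frequencies, k the number of classes (0 = balanced, 1 = single class)."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(k)

# Non-missing value counts from the ESTRANG and CS_ZONA sections.
estrang = imbalance_score([206255, 1563])
cs_zona = imbalance_score([189066, 12793, 3079, 2398])

print(f"ESTRANG: {estrang:.1%}")
print(f"CS_ZONA: {cs_zona:.1%}")
```

Under this assumption both calls agree with the alert figures (93.6% and 73.3%) to one decimal place, which supports the entropy-based reading but does not prove it is the profiler's exact formula.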

Reproduction

Analysis started: 2023-11-01 17:17:05.055421
Analysis finished: 2023-11-01 17:17:56.551385
Duration: 51.5 seconds
Software version: ydata-profiling v4.6.1
Download configuration: config.json

Variables

Distinct: 288
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB
Minimum: 2023-01-01 00:00:00
Maximum: 2023-12-10 00:00:00
Histogram with fixed size bins (bins=50)

SEM_PRI
Real number (ℝ)

Distinct: 42
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.334662
Minimum: 1
Maximum: 42
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 11
Median: 19
Q3: 27
95-th percentile: 37
Maximum: 42
Range: 41
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 10.44513
Coefficient of variation (CV): 0.54022823
Kurtosis: -0.88824396
Mean: 19.334662
Median Absolute Deviation (MAD): 8
Skewness: 0.17118062
Sum: 4384386
Variance: 109.10075
Monotonicity: Not monotonic
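
Several of the SEM_PRI statistics are derived from one another. As a rough consistency check, the sketch below recomputes the mean, variance, coefficient of variation, range and IQR from the other reported figures. It uses only the standard textbook definitions, which the report itself does not spell out.

```python
# Figures copied from the SEM_PRI statistics.
n = 226_763          # observations (SEM_PRI has no missing values)
total = 4_384_386    # Sum
std = 10.44513       # Standard deviation
minimum, q1, q3, maximum = 1, 11, 27, 42

mean = total / n                  # Mean = Sum / count
variance = std ** 2               # Variance = standard deviation squared
cv = std / mean                   # Coefficient of variation = SD / mean
value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # Interquartile range = Q3 - Q1

print(mean, variance, cv, value_range, iqr)
```

Small differences in the last digits are expected, since the inputs are the rounded values printed in the report.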
Histogram with fixed size bins (bins=42)
Value              Count   Frequency (%)
20                 8436    3.7%
18                 7926    3.5%
21                 7705    3.4%
19                 7644    3.4%
12                 7622    3.4%
15                 7582    3.3%
16                 7335    3.2%
13                 7250    3.2%
22                 7117    3.1%
11                 7109    3.1%
Other values (32)  151037  66.6%

Value  Count  Frequency (%)
1      5767   2.5%
2      4550   2.0%
3      4039   1.8%
4      3648   1.6%
5      4216   1.9%
6      5172   2.3%
7      5835   2.6%
8      5975   2.6%
9      6419   2.8%
10     6726   3.0%

Value  Count  Frequency (%)
42     6      < 0.1%
41     789    0.3%
40     2961   1.3%
39     3440   1.5%
38     4030   1.8%
37     4071   1.8%
36     3901   1.7%
35     4102   1.8%
34     4426   2.0%
33     4443   2.0%

ESTRANG
Categorical

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 18945
Missing (%): 8.4%
Memory size: 1.7 MiB

2.0: 206255
1.0: 1563

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 623454
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
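
To make the character-level figures concrete, the sketch below rebuilds them for ESTRANG from its two non-missing values, using Python's standard `unicodedata` module. It assumes the profiler treats each value as its string form ("2.0", "1.0"), which is consistent with the reported length of 3. The category codes `Nd` and `Po` are the Unicode general-category abbreviations for Decimal Number and Other Punctuation.

```python
import unicodedata
from collections import Counter

# Non-missing ESTRANG values as strings, with their counts from the report.
value_counts = {"2.0": 206255, "1.0": 1563}

char_counts = Counter()
category_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count
        # General category code, e.g. "Nd" (Decimal Number), "Po" (Other Punctuation).
        category_counts[unicodedata.category(ch)] += count

total_chars = sum(char_counts.values())
print(total_chars)
print(dict(char_counts))
print(dict(category_counts))
```

The same approach extends to the other categorical variables by swapping in their value counts.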

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 2.0
3rd row: 2.0
4th row: 2.0
5th row: 2.0

Common Values

Value      Count   Frequency (%)
2.0        206255  91.0%
1.0        1563    0.7%
(Missing)  18945   8.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
2.0    206255  99.2%
1.0    1563    0.8%

Most occurring characters

Value  Count   Frequency (%)
.      207818  33.3%
0      207818  33.3%
2      206255  33.1%
1      1563    0.3%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     415636  66.7%
Other Punctuation  207818  33.3%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      207818  50.0%
2      206255  49.6%
1      1563    0.4%

Other Punctuation
Value  Count   Frequency (%)
.      207818  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  623454  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
.      207818  33.3%
0      207818  33.3%
2      206255  33.1%
1      1563    0.3%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  623454  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
.      207818  33.3%
0      207818  33.3%
2      206255  33.1%
1      1563    0.3%

NU_IDADE_N
Real number (ℝ)

Distinct: 121
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.029171
Minimum: -5
Maximum: 123
Zeros: 1402
Zeros (%): 0.6%
Negative: 3
Negative (%): < 0.1%
Memory size: 1.7 MiB

Quantile statistics

Minimum: -5
5-th percentile: 1
Q1: 3
Median: 9
Q3: 65
95-th percentile: 87
Maximum: 123
Range: 128
Interquartile range (IQR): 62

Descriptive statistics

Standard deviation: 32.817006
Coefficient of variation (CV): 1.0576179
Kurtosis: -1.3002122
Mean: 31.029171
Median Absolute Deviation (MAD): 8
Skewness: 0.6170725
Sum: 7036268
Variance: 1076.9559
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count   Frequency (%)
1                   26456   11.7%
2                   18114   8.0%
3                   14738   6.5%
4                   12882   5.7%
5                   10834   4.8%
6                   9071    4.0%
7                   7754    3.4%
8                   6661    2.9%
9                   5695    2.5%
10                  5037    2.2%
Other values (111)  109521  48.3%

Value  Count  Frequency (%)
-5     1      < 0.1%
-3     1      < 0.1%
-2     1      < 0.1%
0      1402   0.6%
1      26456  11.7%
2      18114  8.0%
3      14738  6.5%
4      12882  5.7%
5      10834  4.8%
6      9071   4.0%

Value  Count  Frequency (%)
123    2      < 0.1%
117    1      < 0.1%
115    1      < 0.1%
114    1      < 0.1%
113    3      < 0.1%
112    1      < 0.1%
111    3      < 0.1%
110    4      < 0.1%
109    3      < 0.1%
108    2      < 0.1%

SG_UF
Categorical

Distinct27
Distinct (%)< 0.1%
Missing54
Missing (%)< 0.1%
Memory size1.7 MiB
SP
64239 
PR
22660 
MG
19724 
RJ
15533 
RS
12034 
Other values (22)
92519 

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters453418
Distinct characters17
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMG
2nd rowRJ
3rd rowSP
4th rowSP
5th rowSP

Common Values

ValueCountFrequency (%)
SP 64239
28.3%
PR 22660
 
10.0%
MG 19724
 
8.7%
RJ 15533
 
6.8%
RS 12034
 
5.3%
CE 11382
 
5.0%
SC 9669
 
4.3%
BA 9130
 
4.0%
DF 8045
 
3.5%
PE 7340
 
3.2%
Other values (17) 46953
20.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
sp 64239
28.3%
pr 22660
 
10.0%
mg 19724
 
8.7%
rj 15533
 
6.9%
rs 12034
 
5.3%
ce 11382
 
5.0%
sc 9669
 
4.3%
ba 9130
 
4.0%
df 8045
 
3.5%
pe 7340
 
3.2%
Other values (17) 46953
20.7%

Most occurring characters

ValueCountFrequency (%)
P 105077
23.2%
S 98389
21.7%
R 55268
12.2%
M 33592
 
7.4%
G 27046
 
6.0%
E 24773
 
5.5%
A 24046
 
5.3%
C 23179
 
5.1%
J 15533
 
3.4%
B 12873
 
2.8%
Other values (7) 33642
 
7.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 453418
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 105077
23.2%
S 98389
21.7%
R 55268
12.2%
M 33592
 
7.4%
G 27046
 
6.0%
E 24773
 
5.5%
A 24046
 
5.3%
C 23179
 
5.1%
J 15533
 
3.4%
B 12873
 
2.8%
Other values (7) 33642
 
7.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 453418
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 105077
23.2%
S 98389
21.7%
R 55268
12.2%
M 33592
 
7.4%
G 27046
 
6.0%
E 24773
 
5.5%
A 24046
 
5.3%
C 23179
 
5.1%
J 15533
 
3.4%
B 12873
 
2.8%
Other values (7) 33642
 
7.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 453418
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 105077
23.2%
S 98389
21.7%
R 55268
12.2%
M 33592
 
7.4%
G 27046
 
6.0%
E 24773
 
5.5%
A 24046
 
5.3%
C 23179
 
5.1%
J 15533
 
3.4%
B 12873
 
2.8%
Other values (7) 33642
 
7.4%

CS_ZONA
Categorical

IMBALANCE  MISSING 

Distinct4
Distinct (%)< 0.1%
Missing19427
Missing (%)8.6%
Memory size1.7 MiB
1.0
189066 
2.0
 
12793
9.0
 
3079
3.0
 
2398

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters622008
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
1.0 189066
83.4%
2.0 12793
 
5.6%
9.0 3079
 
1.4%
3.0 2398
 
1.1%
(Missing) 19427
 
8.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 189066
91.2%
2.0 12793
 
6.2%
9.0 3079
 
1.5%
3.0 2398
 
1.2%

Most occurring characters

ValueCountFrequency (%)
. 207336
33.3%
0 207336
33.3%
1 189066
30.4%
2 12793
 
2.1%
9 3079
 
0.5%
3 2398
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 414672
66.7%
Other Punctuation 207336
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 207336
50.0%
1 189066
45.6%
2 12793
 
3.1%
9 3079
 
0.7%
3 2398
 
0.6%
Other Punctuation
ValueCountFrequency (%)
. 207336
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 622008
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 207336
33.3%
0 207336
33.3%
1 189066
30.4%
2 12793
 
2.1%
9 3079
 
0.5%
3 2398
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 622008
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 207336
33.3%
0 207336
33.3%
1 189066
30.4%
2 12793
 
2.1%
9 3079
 
0.5%
3 2398
 
0.4%

OUT_ANIM
Text

MISSING 

Distinct115
Distinct (%)15.7%
Missing226032
Missing (%)99.7%
Memory size1.7 MiB

Length

Max length33
Median length31
Mean length8.8604651
Min length1

Characters and Unicode

Total characters6477
Distinct characters32
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique82 ?
Unique (%)11.2%

Sample

1st rowGATO E CACHORRO
2nd rowCACHORRO
3rd rowCACHORRO
4th rowGATO
5th rowCACHORRO
Value              Count  Frequency (%)
cachorro           445    45.0%
gato               182    18.4%
e                  76     7.7%
cao                38     3.8%
gatos              29     2.9%
gato,cachorro      14     1.4%
de                 12     1.2%
cachorros          10     1.0%
estimacao          10     1.0%
animais            9      0.9%
Other values (89)  163    16.5%
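
The word table above looks like a count of lowercased, whitespace-separated tokens: "e" is counted as a word, and "gato,cachorro" is kept as a single token. The sketch below implements that reading on the five sample rows shown for OUT_ANIM. The function name `word_frequencies` is hypothetical, and the profiler's actual tokenizer may apply further normalization that is not reproduced here.

```python
from collections import Counter

def word_frequencies(values):
    """Count lowercased, whitespace-separated tokens across non-missing values."""
    counts = Counter()
    for value in values:
        if value is None:  # skip missing entries
            continue
        counts.update(value.lower().split())
    return counts

# The five sample rows listed for OUT_ANIM.
sample = ["GATO E CACHORRO", "CACHORRO", "CACHORRO", "GATO", "CACHORRO"]
freq = word_frequencies(sample)

for word, count in freq.most_common():
    print(word, count)
```

Splitting only on whitespace keeps comma-joined entries such as "gato,cachorro" together, which matches the table but undercounts the individual animals; splitting on punctuation as well would be a reasonable alternative for analysis.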

Most occurring characters

ValueCountFrequency (%)
O 1379
21.3%
C 1057
16.3%
R 1002
15.5%
A 949
14.7%
H 507
 
7.8%
T 278
 
4.3%
G 276
 
4.3%
258
 
4.0%
E 165
 
2.5%
S 116
 
1.8%
Other values (22) 490
 
7.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 6117
94.4%
Space Separator 258
 
4.0%
Other Punctuation 97
 
1.5%
Dash Punctuation 2
 
< 0.1%
Decimal Number 2
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O 1379
22.5%
C 1057
17.3%
R 1002
16.4%
A 949
15.5%
H 507
 
8.3%
T 278
 
4.5%
G 276
 
4.5%
E 165
 
2.7%
S 116
 
1.9%
I 103
 
1.7%
Other values (13) 285
 
4.7%
Other Punctuation
ValueCountFrequency (%)
, 67
69.1%
/ 14
 
14.4%
. 13
 
13.4%
; 3
 
3.1%
Decimal Number
ValueCountFrequency (%)
3 1
50.0%
1 1
50.0%
Space Separator
ValueCountFrequency (%)
258
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6117
94.4%
Common 360
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
O 1379
22.5%
C 1057
17.3%
R 1002
16.4%
A 949
15.5%
H 507
 
8.3%
T 278
 
4.5%
G 276
 
4.5%
E 165
 
2.7%
S 116
 
1.9%
I 103
 
1.7%
Other values (13) 285
 
4.7%
Common
ValueCountFrequency (%)
258
71.7%
, 67
 
18.6%
/ 14
 
3.9%
. 13
 
3.6%
; 3
 
0.8%
- 2
 
0.6%
3 1
 
0.3%
1 1
 
0.3%
+ 1
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6477
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O 1379
21.3%
C 1057
16.3%
R 1002
15.5%
A 949
14.7%
H 507
 
7.8%
T 278
 
4.3%
G 276
 
4.3%
258
 
4.0%
E 165
 
2.5%
S 116
 
1.8%
Other values (22) 490
 
7.6%

FEBRE
Categorical

MISSING 

Distinct3
Distinct (%)< 0.1%
Missing32100
Missing (%)14.2%
Memory size1.7 MiB
1.0
128752 
2.0
64489 
9.0
 
1422

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters583989
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row2.0
4th row1.0
5th row2.0

Common Values

ValueCountFrequency (%)
1.0 128752
56.8%
2.0 64489
28.4%
9.0 1422
 
0.6%
(Missing) 32100
 
14.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 128752
66.1%
2.0 64489
33.1%
9.0 1422
 
0.7%

Most occurring characters

ValueCountFrequency (%)
. 194663
33.3%
0 194663
33.3%
1 128752
22.0%
2 64489
 
11.0%
9 1422
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 389326
66.7%
Other Punctuation 194663
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 194663
50.0%
1 128752
33.1%
2 64489
 
16.6%
9 1422
 
0.4%
Other Punctuation
ValueCountFrequency (%)
. 194663
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 583989
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 194663
33.3%
0 194663
33.3%
1 128752
22.0%
2 64489
 
11.0%
9 1422
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 583989
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 194663
33.3%
0 194663
33.3%
1 128752
22.0%
2 64489
 
11.0%
9 1422
 
0.2%

TOSSE
Categorical

IMBALANCE  MISSING 

Distinct3
Distinct (%)< 0.1%
Missing20150
Missing (%)8.9%
Memory size1.7 MiB
1.0
171597 
2.0
33998 
9.0
 
1018

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters619839
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row9.0
3rd row1.0
4th row2.0
5th row1.0

Common Values

ValueCountFrequency (%)
1.0 171597
75.7%
2.0 33998
 
15.0%
9.0 1018
 
0.4%
(Missing) 20150
 
8.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 171597
83.1%
2.0 33998
 
16.5%
9.0 1018
 
0.5%

Most occurring characters

ValueCountFrequency (%)
. 206613
33.3%
0 206613
33.3%
1 171597
27.7%
2 33998
 
5.5%
9 1018
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 413226
66.7%
Other Punctuation 206613
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 206613
50.0%
1 171597
41.5%
2 33998
 
8.2%
9 1018
 
0.2%
Other Punctuation
ValueCountFrequency (%)
. 206613
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 619839
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 206613
33.3%
0 206613
33.3%
1 171597
27.7%
2 33998
 
5.5%
9 1018
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 619839
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 206613
33.3%
0 206613
33.3%
1 171597
27.7%
2 33998
 
5.5%
9 1018
 
0.2%

GARGANTA
Categorical

HIGH CORRELATION  MISSING 

Distinct3
Distinct (%)< 0.1%
Missing63540
Missing (%)28.0%
Memory size1.7 MiB
2.0
133476 
1.0
24738 
9.0
 
5009

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters489669
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2.0
2nd row9.0
3rd row2.0
4th row2.0
5th row2.0

Common Values

ValueCountFrequency (%)
2.0 133476
58.9%
1.0 24738
 
10.9%
9.0 5009
 
2.2%
(Missing) 63540
28.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2.0 133476
81.8%
1.0 24738
 
15.2%
9.0 5009
 
3.1%

Most occurring characters

ValueCountFrequency (%)
. 163223
33.3%
0 163223
33.3%
2 133476
27.3%
1 24738
 
5.1%
9 5009
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 326446
66.7%
Other Punctuation 163223
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 163223
50.0%
2 133476
40.9%
1 24738
 
7.6%
9 5009
 
1.5%
Other Punctuation
ValueCountFrequency (%)
. 163223
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 489669
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 163223
33.3%
0 163223
33.3%
2 133476
27.3%
1 24738
 
5.1%
9 5009
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 489669
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 163223
33.3%
0 163223
33.3%
2 133476
27.3%
1 24738
 
5.1%
9 5009
 
1.0%

DISPNEIA
Categorical

HIGH CORRELATION  MISSING 

Distinct3
Distinct (%)< 0.1%
Missing29836
Missing (%)13.2%
Memory size1.7 MiB
1.0
146911 
2.0
48674 
9.0
 
1342

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters590781
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row9.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
1.0 146911
64.8%
2.0 48674
 
21.5%
9.0 1342
 
0.6%
(Missing) 29836
 
13.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 146911
74.6%
2.0 48674
 
24.7%
9.0 1342
 
0.7%

Most occurring characters

ValueCountFrequency (%)
. 196927
33.3%
0 196927
33.3%
1 146911
24.9%
2 48674
 
8.2%
9 1342
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 393854
66.7%
Other Punctuation 196927
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 196927
50.0%
1 146911
37.3%
2 48674
 
12.4%
9 1342
 
0.3%
Other Punctuation
ValueCountFrequency (%)
. 196927
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 590781
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 196927
33.3%
0 196927
33.3%
1 146911
24.9%
2 48674
 
8.2%
9 1342
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 590781
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 196927
33.3%
0 196927
33.3%
1 146911
24.9%
2 48674
 
8.2%
9 1342
 
0.2%

DESC_RESP
Categorical

HIGH CORRELATION  MISSING 

Distinct3
Distinct (%)< 0.1%
Missing36846
Missing (%)16.2%
Memory size1.7 MiB
1.0
138790 
2.0
49905 
9.0
 
1222

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters569751
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row9.0
3rd row2.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
1.0 138790
61.2%
2.0 49905
 
22.0%
9.0 1222
 
0.5%
(Missing) 36846
 
16.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 138790
73.1%
2.0 49905
 
26.3%
9.0 1222
 
0.6%

Most occurring characters

ValueCountFrequency (%)
. 189917
33.3%
0 189917
33.3%
1 138790
24.4%
2 49905
 
8.8%
9 1222
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 379834
66.7%
Other Punctuation 189917
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 189917
50.0%
1 138790
36.5%
2 49905
 
13.1%
9 1222
 
0.3%
Other Punctuation
ValueCountFrequency (%)
. 189917
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 569751
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 189917
33.3%
0 189917
33.3%
1 138790
24.4%
2 49905
 
8.8%
9 1222
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 569751
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 189917
33.3%
0 189917
33.3%
1 138790
24.4%
2 49905
 
8.8%
9 1222
 
0.2%

SATURACAO
Categorical

HIGH CORRELATION  MISSING 

Distinct3
Distinct (%)< 0.1%
Missing41637
Missing (%)18.4%
Memory size1.7 MiB
1.0
119698 
2.0
63757 
9.0
 
1671

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters555378
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row9.0
3rd row1.0
4th row1.0
5th row2.0

Common Values

ValueCountFrequency (%)
1.0 119698
52.8%
2.0 63757
28.1%
9.0 1671
 
0.7%
(Missing) 41637
 
18.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 119698
64.7%
2.0 63757
34.4%
9.0 1671
 
0.9%

Most occurring characters

ValueCountFrequency (%)
. 185126
33.3%
0 185126
33.3%
1 119698
21.6%
2 63757
 
11.5%
9 1671
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 370252
66.7%
Other Punctuation 185126
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 185126
50.0%
1 119698
32.3%
2 63757
 
17.2%
9 1671
 
0.5%
Other Punctuation
ValueCountFrequency (%)
. 185126
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 555378
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 185126
33.3%
0 185126
33.3%
1 119698
21.6%
2 63757
 
11.5%
9 1671
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 555378
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 185126
33.3%
0 185126
33.3%
1 119698
21.6%
2 63757
 
11.5%
9 1671
 
0.3%

DIARREIA
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct3
Distinct (%)< 0.1%
Missing64895
Missing (%)28.6%
Memory size1.7 MiB
2.0
142670 
1.0
17088 
9.0
 
2110

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters485604
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2.0
2nd row9.0
3rd row2.0
4th row2.0
5th row2.0

Common Values

ValueCountFrequency (%)
2.0 142670
62.9%
1.0 17088
 
7.5%
9.0 2110
 
0.9%
(Missing) 64895
28.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2.0 142670
88.1%
1.0 17088
 
10.6%
9.0 2110
 
1.3%

Most occurring characters

ValueCountFrequency (%)
. 161868
33.3%
0 161868
33.3%
2 142670
29.4%
1 17088
 
3.5%
9 2110
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 323736
66.7%
Other Punctuation 161868
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 161868
50.0%
2 142670
44.1%
1 17088
 
5.3%
9 2110
 
0.7%
Other Punctuation
ValueCountFrequency (%)
. 161868
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 485604
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 161868
33.3%
0 161868
33.3%
2 142670
29.4%
1 17088
 
3.5%
9 2110
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 485604
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 161868
33.3%
0 161868
33.3%
2 142670
29.4%
1 17088
 
3.5%
9 2110
 
0.4%

VOMITO
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct3
Distinct (%)< 0.1%
Missing63050
Missing (%)27.8%
Memory size1.7 MiB
2.0
135994 
1.0
25602 
9.0
 
2117

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters491139
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row9.0
3rd row2.0
4th row2.0
5th row2.0

Common Values

ValueCountFrequency (%)
2.0 135994
60.0%
1.0 25602
 
11.3%
9.0 2117
 
0.9%
(Missing) 63050
27.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2.0 135994
83.1%
1.0 25602
 
15.6%
9.0 2117
 
1.3%

Most occurring characters

ValueCountFrequency (%)
. 163713
33.3%
0 163713
33.3%
2 135994
27.7%
1 25602
 
5.2%
9 2117
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 327426
66.7%
Other Punctuation 163713
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 163713
50.0%
2 135994
41.5%
1 25602
 
7.8%
9 2117
 
0.6%
Other Punctuation
ValueCountFrequency (%)
. 163713
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 491139
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 163713
33.3%
0 163713
33.3%
2 135994
27.7%
1 25602
 
5.2%
9 2117
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 491139
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 163713
33.3%
0 163713
33.3%
2 135994
27.7%
1 25602
 
5.2%
9 2117
 
0.4%

DOR_ABD
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct3
Distinct (%)< 0.1%
Missing66861
Missing (%)29.5%
Memory size1.7 MiB
2.0
143003 
1.0
 
12551
9.0
 
4348

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters479706
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2.0
2nd row9.0
3rd row2.0
4th row2.0
5th row2.0

Common Values

ValueCountFrequency (%)
2.0 143003
63.1%
1.0 12551
 
5.5%
9.0 4348
 
1.9%
(Missing) 66861
29.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2.0 143003
89.4%
1.0 12551
 
7.8%
9.0 4348
 
2.7%

Most occurring characters

ValueCountFrequency (%)
. 159902
33.3%
0 159902
33.3%
2 143003
29.8%
1 12551
 
2.6%
9 4348
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 319804
66.7%
Other Punctuation 159902
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 159902
50.0%
2 143003
44.7%
1 12551
 
3.9%
9 4348
 
1.4%
Other Punctuation
ValueCountFrequency (%)
. 159902
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 479706
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 159902
33.3%
0 159902
33.3%
2 143003
29.8%
1 12551
 
2.6%
9 4348
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 479706
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 159902
33.3%
0 159902
33.3%
2 143003
29.8%
1 12551
 
2.6%
9 4348
 
0.9%

FADIGA
Categorical

HIGH CORRELATION  MISSING 

Distinct3
Distinct (%)< 0.1%
Missing62595
Missing (%)27.6%
Memory size1.7 MiB
2.0
125302 
1.0
35280 
9.0
 
3586

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters492504
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row9.0
3rd row2.0
4th row2.0
5th row2.0

Common Values

ValueCountFrequency (%)
2.0 125302
55.3%
1.0 35280
 
15.6%
9.0 3586
 
1.6%
(Missing) 62595
27.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2.0 125302
76.3%
1.0 35280
 
21.5%
9.0 3586
 
2.2%

Most occurring characters

Value  Count   Frequency (%)
.      164168  33.3%
0      164168  33.3%
2      125302  25.4%
1      35280   7.2%
9      3586    0.7%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     328336  66.7%
Other Punctuation  164168  33.3%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      164168  50.0%
2      125302  38.2%
1      35280   10.7%
9      3586    1.1%

Other Punctuation
Value  Count   Frequency (%)
.      164168  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  492504  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
.      164168  33.3%
0      164168  33.3%
2      125302  25.4%
1      35280   7.2%
9      3586    0.7%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  492504  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
.      164168  33.3%
0      164168  33.3%
2      125302  25.4%
1      35280   7.2%
9      3586    0.7%

PERD_OLFT
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 68822
Missing (%): 30.3%
Memory size: 1.7 MiB

Value  Count
2.0    148480
9.0    6906
1.0    2555

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 473823
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 9.0
3rd row: 2.0
4th row: 2.0
5th row: 2.0

Common Values

Value      Count   Frequency (%)
2.0        148480  65.5%
9.0        6906    3.0%
1.0        2555    1.1%
(Missing)  68822   30.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
2.0    148480  94.0%
9.0    6906    4.4%
1.0    2555    1.6%

Most occurring characters

Value  Count   Frequency (%)
.      157941  33.3%
0      157941  33.3%
2      148480  31.3%
9      6906    1.5%
1      2555    0.5%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     315882  66.7%
Other Punctuation  157941  33.3%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      157941  50.0%
2      148480  47.0%
9      6906    2.2%
1      2555    0.8%

Other Punctuation
Value  Count   Frequency (%)
.      157941  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  473823  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
.      157941  33.3%
0      157941  33.3%
2      148480  31.3%
9      6906    1.5%
1      2555    0.5%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  473823  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
.      157941  33.3%
0      157941  33.3%
2      148480  31.3%
9      6906    1.5%
1      2555    0.5%

PERD_PALA
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 69072
Missing (%): 30.5%
Memory size: 1.7 MiB

Value  Count
2.0    148398
9.0    6894
1.0    2399

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 473073
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 9.0
3rd row: 2.0
4th row: 2.0
5th row: 2.0

Common Values

Value      Count   Frequency (%)
2.0        148398  65.4%
9.0        6894    3.0%
1.0        2399    1.1%
(Missing)  69072   30.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
2.0    148398  94.1%
9.0    6894    4.4%
1.0    2399    1.5%

Most occurring characters

Value  Count   Frequency (%)
.      157691  33.3%
0      157691  33.3%
2      148398  31.4%
9      6894    1.5%
1      2399    0.5%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     315382  66.7%
Other Punctuation  157691  33.3%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      157691  50.0%
2      148398  47.1%
9      6894    2.2%
1      2399    0.8%

Other Punctuation
Value  Count   Frequency (%)
.      157691  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  473073  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
.      157691  33.3%
0      157691  33.3%
2      148398  31.4%
9      6894    1.5%
1      2399    0.5%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  473073  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
.      157691  33.3%
0      157691  33.3%
2      148398  31.4%
9      6894    1.5%
1      2399    0.5%

OUTRO_SIN
Categorical

MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 67384
Missing (%): 29.7%
Memory size: 1.7 MiB

Value  Count
2.0    91667
1.0    64556
9.0    3156

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 478137
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 9.0
3rd row: 2.0
4th row: 1.0
5th row: 1.0

Common Values

Value      Count  Frequency (%)
2.0        91667  40.4%
1.0        64556  28.5%
9.0        3156   1.4%
(Missing)  67384  29.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
2.0    91667  57.5%
1.0    64556  40.5%
9.0    3156   2.0%

Most occurring characters

Value  Count   Frequency (%)
.      159379  33.3%
0      159379  33.3%
2      91667   19.2%
1      64556   13.5%
9      3156    0.7%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     318758  66.7%
Other Punctuation  159379  33.3%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      159379  50.0%
2      91667   28.8%
1      64556   20.3%
9      3156    1.0%

Other Punctuation
Value  Count   Frequency (%)
.      159379  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  478137  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
.      159379  33.3%
0      159379  33.3%
2      91667   19.2%
1      64556   13.5%
9      3156    0.7%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  478137  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
.      159379  33.3%
0      159379  33.3%
2      91667   19.2%
1      64556   13.5%
9      3156    0.7%

OUTRO_DES
Text

MISSING 

Distinct: 19018
Distinct (%): 29.9%
Missing: 163149
Missing (%): 71.9%
Memory size: 1.7 MiB

Length

Max length: 60
Median length: 54
Mean length: 14.280567
Min length: 1

Characters and Unicode

Total characters: 908444
Distinct characters: 64
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
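
As a rough illustration of how such character properties can be read, the sketch below counts characters by Unicode general category with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and this is not necessarily how the report itself was computed. The two-letter codes correspond to the category names used in the tables of this section (`Lu` = Uppercase Letter, `Zs` = Space Separator, `Sm` = Math Symbol).

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# 5th sample row of OUTRO_DES, copied from this report.
sample = "LESOES BOLHOSAS+HIPEREMIA"
counts = category_counts(sample)
print(counts)

# Rare characters listed in the per-category tables of this section.
for ch in "°²^_":
    print(repr(ch), unicodedata.category(ch))
```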

Unique

Unique: 16465
Unique (%): 25.9%

Sample

1st row: FRAQUEZA,MAL ESTAR,MIALGIA
2nd row: CORIZA
3rd row: CORIZA
4th row: CORIZA
5th row: LESOES BOLHOSAS+HIPEREMIA

Value                Count  Frequency (%)
coriza               20888  17.6%
nasal                5951   5.0%
dor                  4379   3.7%
cefaleia             4147   3.5%
e                    3298   2.8%
congestao            3262   2.8%
mialgia              2855   2.4%
toracica             2001   1.7%
de                   1955   1.7%
inapetencia          1849   1.6%
Other values (9571)  67836  57.3%

Most occurring characters

Value              Count   Frequency (%)
A                  136700  15.0%
I                  91509   10.1%
O                  87376   9.6%
C                  70125   7.7%
R                  67558   7.4%
E                  67252   7.4%
(space)            55059   6.1%
S                  53146   5.9%
N                  44331   4.9%
T                  36626   4.0%
Other values (54)  198762  21.9%

Most occurring categories

Value              Count   Frequency (%)
Uppercase Letter   824957  90.8%
Space Separator    55071   6.1%
Other Punctuation  26093   2.9%
Decimal Number     1199    0.1%
Math Symbol        876     0.1%
Dash Punctuation   154     < 0.1%
Open Punctuation   45      < 0.1%
Close Punctuation  42      < 0.1%
Other Symbol       3       < 0.1%
Other Number       2       < 0.1%
Other values (2)   2       < 0.1%

Most frequent character per category

Uppercase Letter
Value              Count   Frequency (%)
A                  136700  16.6%
I                  91509   11.1%
O                  87376   10.6%
C                  70125   8.5%
R                  67558   8.2%
E                  67252   8.2%
S                  53146   6.4%
N                  44331   5.4%
T                  36626   4.4%
L                  29780   3.6%
Other values (16)  140554  17.0%

Other Punctuation
Value             Count  Frequency (%)
,                 20097  77.0%
.                 2354   9.0%
/                 1937   7.4%
"                 848    3.2%
;                 669    2.6%
%                 68     0.3%
*                 47     0.2%
?                 36     0.1%
:                 26     0.1%
'                 6      < 0.1%
Other values (2)  5      < 0.1%

Decimal Number
Value  Count  Frequency (%)
2      189    15.8%
0      158    13.2%
4      156    13.0%
3      148    12.3%
5      127    10.6%
8      111    9.3%
6      111    9.3%
9      76     6.3%
1      67     5.6%
7      56     4.7%

Math Symbol
Value  Count  Frequency (%)
+      863    98.5%
|      4      0.5%
~      3      0.3%
=      3      0.3%
<      2      0.2%
>      1      0.1%

Space Separator
Value              Count  Frequency (%)
(space)            55059  > 99.9%
(non-ASCII space)  12     < 0.1%

Close Punctuation
Value  Count  Frequency (%)
)      41     97.6%
]      1      2.4%

Dash Punctuation
Value  Count  Frequency (%)
-      154    100.0%

Open Punctuation
Value  Count  Frequency (%)
(      45     100.0%

Other Symbol
Value  Count  Frequency (%)
°      3      100.0%

Other Number
Value  Count  Frequency (%)
²      2      100.0%

Modifier Symbol
Value  Count  Frequency (%)
^      1      100.0%

Connector Punctuation
Value  Count  Frequency (%)
_      1      100.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   824957  90.8%
Common  83487   9.2%

Most frequent character per script

Common
Value              Count  Frequency (%)
(space)            55059  65.9%
,                  20097  24.1%
.                  2354   2.8%
/                  1937   2.3%
+                  863    1.0%
"                  848    1.0%
;                  669    0.8%
2                  189    0.2%
0                  158    0.2%
4                  156    0.2%
Other values (28)  1157   1.4%

Latin
Value              Count   Frequency (%)
A                  136700  16.6%
I                  91509   11.1%
O                  87376   10.6%
C                  70125   8.5%
R                  67558   8.2%
E                  67252   8.2%
S                  53146   6.4%
N                  44331   5.4%
T                  36626   4.4%
L                  29780   3.6%
Other values (16)  140554  17.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  908425  > 99.9%
None   19      < 0.1%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
A                  136700  15.0%
I                  91509   10.1%
O                  87376   9.6%
C                  70125   7.7%
R                  67558   7.4%
E                  67252   7.4%
(space)            55059   6.1%
S                  53146   5.9%
N                  44331   4.9%
T                  36626   4.0%
Other values (50)  198743  21.9%

None
Value              Count  Frequency (%)
(non-ASCII space)  12     63.2%
°                  3      15.8%
²                  2      10.5%
¿                  2      10.5%

VACINA
Categorical

MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 106570
Missing (%): 47.0%
Memory size: 1.7 MiB

Value  Count
9.0    59522
2.0    45950
1.0    14721

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 360579
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 9.0
3rd row: 2.0
4th row: 1.0
5th row: 2.0

Common Values

Value      Count   Frequency (%)
9.0        59522   26.2%
2.0        45950   20.3%
1.0        14721   6.5%
(Missing)  106570  47.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
9.0    59522  49.5%
2.0    45950  38.2%
1.0    14721  12.2%

Most occurring characters

Value  Count   Frequency (%)
.      120193  33.3%
0      120193  33.3%
9      59522   16.5%
2      45950   12.7%
1      14721   4.1%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     240386  66.7%
Other Punctuation  120193  33.3%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      120193  50.0%
9      59522   24.8%
2      45950   19.1%
1      14721   6.1%

Other Punctuation
Value  Count   Frequency (%)
.      120193  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  360579  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
.      120193  33.3%
0      120193  33.3%
9      59522   16.5%
2      45950   12.7%
1      14721   4.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  360579  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
.      120193  33.3%
0      120193  33.3%
9      59522   16.5%
2      45950   12.7%
1      14721   4.1%

VACINA_COV
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 56
Missing (%): < 0.1%
Memory size: 1.7 MiB

Value  Count
1.0    114017
2.0    108403
9.0    4287

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 680121
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value      Count   Frequency (%)
1.0        114017  50.3%
2.0        108403  47.8%
9.0        4287    1.9%
(Missing)  56      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
1.0    114017  50.3%
2.0    108403  47.8%
9.0    4287    1.9%

Most occurring characters

Value  Count   Frequency (%)
.      226707  33.3%
0      226707  33.3%
1      114017  16.8%
2      108403  15.9%
9      4287    0.6%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     453414  66.7%
Other Punctuation  226707  33.3%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      226707  50.0%
1      114017  25.1%
2      108403  23.9%
9      4287    0.9%

Other Punctuation
Value  Count   Frequency (%)
.      226707  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  680121  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
.      226707  33.3%
0      226707  33.3%
1      114017  16.8%
2      108403  15.9%
9      4287    0.6%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  680121  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
.      226707  33.3%
0      226707  33.3%
1      114017  16.8%
2      108403  15.9%
9      4287    0.6%

DOSE_2REF
Date

MISSING 

Distinct: 462
Distinct (%): 1.2%
Missing: 189217
Missing (%): 83.4%
Memory size: 1.7 MiB
Minimum: 2021-01-03 00:00:00
Maximum: 2023-12-04 00:00:00

Histogram with fixed size bins (bins=50)

CLASSI_FIN
Categorical

MISSING 

Distinct: 5
Distinct (%): < 0.1%
Missing: 17353
Missing (%): 7.7%
Memory size: 1.7 MiB

Value  Count
4.0    118732
2.0    40115
5.0    34688
1.0    13128
3.0    2747

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 628230
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4.0
2nd row: 4.0
3rd row: 5.0
4th row: 4.0
5th row: 4.0

Common Values

Value      Count   Frequency (%)
4.0        118732  52.4%
2.0        40115   17.7%
5.0        34688   15.3%
1.0        13128   5.8%
3.0        2747    1.2%
(Missing)  17353   7.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
4.0    118732  56.7%
2.0    40115   19.2%
5.0    34688   16.6%
1.0    13128   6.3%
3.0    2747    1.3%

Most occurring characters

Value  Count   Frequency (%)
.      209410  33.3%
0      209410  33.3%
4      118732  18.9%
2      40115   6.4%
5      34688   5.5%
1      13128   2.1%
3      2747    0.4%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     418820  66.7%
Other Punctuation  209410  33.3%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      209410  50.0%
4      118732  28.3%
2      40115   9.6%
5      34688   8.3%
1      13128   3.1%
3      2747    0.7%

Other Punctuation
Value  Count   Frequency (%)
.      209410  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  628230  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
.      209410  33.3%
0      209410  33.3%
4      118732  18.9%
2      40115   6.4%
5      34688   5.5%
1      13128   2.1%
3      2747    0.4%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  628230  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
.      209410  33.3%
0      209410  33.3%
4      118732  18.9%
2      40115   6.4%
5      34688   5.5%
1      13128   2.1%
3      2747    0.4%

CLASSI_OUT
Text

MISSING 

Distinct: 485
Distinct (%): 24.4%
Missing: 224773
Missing (%): 99.1%
Memory size: 1.7 MiB

Length

Max length: 30
Median length: 27
Mean length: 14.062814
Min length: 2

Characters and Unicode

Total characters: 27985
Distinct characters: 46
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 370
Unique (%): 18.6%

Sample

1st row: MICOBACTERIA TUBERCULOSE
2nd row: TUBERCULOSE
3rd row: CRISE ASMATICA
4th row: BRONQUITE
5th row: DISPNEIA

Value               Count  Frequency (%)
pneumonia           759    23.5%
bacteriana          144    4.5%
bronquiolite        101    3.1%
tuberculose         101    3.1%
virus               96     3.0%
sincicial           92     2.9%
asma                89     2.8%
pnm                 83     2.6%
respiratorio        75     2.3%
(blank in source)   58     1.8%
Other values (472)  1626   50.4%

Most occurring characters

Value              Count  Frequency (%)
A                  2867   10.2%
N                  2827   10.1%
I                  2800   10.0%
E                  2555   9.1%
O                  2437   8.7%
U                  1969   7.0%
R                  1645   5.9%
P                  1512   5.4%
M                  1441   5.1%
C                  1353   4.8%
Other values (36)  6579   23.5%

Most occurring categories

Value              Count  Frequency (%)
Uppercase Letter   26504  94.7%
Space Separator    1240   4.4%
Other Punctuation  147    0.5%
Decimal Number     46     0.2%
Dash Punctuation   26     0.1%
Math Symbol        19     0.1%
Close Punctuation  2      < 0.1%
Open Punctuation   1      < 0.1%

Most frequent character per category

Uppercase Letter
Value              Count  Frequency (%)
A                  2867   10.8%
N                  2827   10.7%
I                  2800   10.6%
E                  2555   9.6%
O                  2437   9.2%
U                  1969   7.4%
R                  1645   6.2%
P                  1512   5.7%
M                  1441   5.4%
C                  1353   5.1%
Other values (16)  5098   19.2%

Decimal Number
Value  Count  Frequency (%)
1      13     28.3%
3      8      17.4%
8      6      13.0%
0      5      10.9%
6      4      8.7%
4      4      8.7%
9      3      6.5%
5      3      6.5%

Other Punctuation
Value  Count  Frequency (%)
/      59     40.1%
,      48     32.7%
.      35     23.8%
?      3      2.0%
;      2      1.4%

Math Symbol
Value  Count  Frequency (%)
+      18     94.7%
=      1      5.3%

Close Punctuation
Value  Count  Frequency (%)
]      1      50.0%
)      1      50.0%

Space Separator
Value    Count  Frequency (%)
(space)  1240   100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      26     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   26504  94.7%
Common  1481   5.3%

Most frequent character per script

Latin
Value              Count  Frequency (%)
A                  2867   10.8%
N                  2827   10.7%
I                  2800   10.6%
E                  2555   9.6%
O                  2437   9.2%
U                  1969   7.4%
R                  1645   6.2%
P                  1512   5.7%
M                  1441   5.4%
C                  1353   5.1%
Other values (16)  5098   19.2%

Common
Value              Count  Frequency (%)
(space)            1240   83.7%
/                  59     4.0%
,                  48     3.2%
.                  35     2.4%
-                  26     1.8%
+                  18     1.2%
1                  13     0.9%
3                  8      0.5%
8                  6      0.4%
0                  5      0.3%
Other values (10)  23     1.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  27985  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
A                  2867   10.2%
N                  2827   10.1%
I                  2800   10.0%
E                  2555   9.1%
O                  2437   8.7%
U                  1969   7.0%
R                  1645   5.9%
P                  1512   5.4%
M                  1441   5.1%
C                  1353   4.8%
Other values (36)  6579   23.5%

Interactions


Correlations

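Variable | SEM_PRI | NU_IDADE_N | ESTRANG | SG_UF | CS_ZONA | FEBRE | TOSSE | GARGANTA | DISPNEIA | DESC_RESP | SATURACAO | DIARREIA | VOMITO | DOR_ABD | FADIGA | PERD_OLFT | PERD_PALA | OUTRO_SIN | VACINA | VACINA_COV | CLASSI_FIN
SEM_PRI | 1.000 | -0.073 | 0.009 | 0.077 | 0.041 | 0.048 | 0.064 | 0.023 | 0.026 | 0.045 | 0.022 | 0.018 | 0.014 | 0.019 | 0.026 | 0.028 | 0.028 | 0.023 | 0.067 | 0.070 | 0.137
NU_IDADE_N | -0.073 | 1.000 | 0.050 | 0.075 | 0.042 | 0.179 | 0.163 | 0.091 | 0.056 | 0.100 | 0.087 | 0.028 | 0.085 | 0.040 | 0.115 | 0.065 | 0.060 | 0.039 | 0.093 | 0.487 | 0.236
ESTRANG | 0.009 | 0.050 | 1.000 | 0.113 | 0.012 | 0.000 | 0.008 | 0.004 | 0.001 | 0.007 | 0.000 | 0.007 | 0.003 | 0.000 | 0.006 | 0.000 | 0.000 | 0.000 | 0.013 | 0.021 | 0.019
SG_UF | 0.077 | 0.075 | 0.113 | 1.000 | 0.253 | 0.135 | 0.094 | 0.143 | 0.116 | 0.087 | 0.149 | 0.067 | 0.081 | 0.081 | 0.128 | 0.086 | 0.090 | 0.164 | 0.187 | 0.135 | 0.157
CS_ZONA | 0.041 | 0.042 | 0.012 | 0.253 | 1.000 | 0.034 | 0.028 | 0.022 | 0.035 | 0.035 | 0.024 | 0.008 | 0.011 | 0.016 | 0.019 | 0.009 | 0.011 | 0.039 | 0.073 | 0.051 | 0.061
FEBRE | 0.048 | 0.179 | 0.000 | 0.135 | 0.034 | 1.000 | 0.435 | 0.281 | 0.287 | 0.329 | 0.284 | 0.389 | 0.386 | 0.275 | 0.287 | 0.229 | 0.230 | 0.204 | 0.043 | 0.120 | 0.087
TOSSE | 0.064 | 0.163 | 0.008 | 0.094 | 0.028 | 0.435 | 1.000 | 0.296 | 0.334 | 0.352 | 0.280 | 0.383 | 0.376 | 0.274 | 0.291 | 0.231 | 0.231 | 0.213 | 0.048 | 0.114 | 0.104
GARGANTA | 0.023 | 0.091 | 0.004 | 0.143 | 0.022 | 0.281 | 0.296 | 1.000 | 0.311 | 0.272 | 0.255 | 0.356 | 0.353 | 0.518 | 0.423 | 0.506 | 0.501 | 0.232 | 0.077 | 0.077 | 0.083
DISPNEIA | 0.026 | 0.056 | 0.001 | 0.116 | 0.035 | 0.287 | 0.334 | 0.311 | 1.000 | 0.510 | 0.429 | 0.357 | 0.343 | 0.339 | 0.310 | 0.280 | 0.278 | 0.176 | 0.039 | 0.010 | 0.065
DESC_RESP | 0.045 | 0.100 | 0.007 | 0.087 | 0.035 | 0.329 | 0.352 | 0.272 | 0.510 | 1.000 | 0.505 | 0.408 | 0.402 | 0.302 | 0.350 | 0.256 | 0.257 | 0.206 | 0.040 | 0.081 | 0.094
SATURACAO | 0.022 | 0.087 | 0.000 | 0.149 | 0.024 | 0.284 | 0.280 | 0.255 | 0.429 | 0.505 | 1.000 | 0.370 | 0.365 | 0.277 | 0.316 | 0.236 | 0.240 | 0.208 | 0.070 | 0.035 | 0.053
DIARREIA | 0.018 | 0.028 | 0.007 | 0.067 | 0.008 | 0.389 | 0.383 | 0.356 | 0.357 | 0.408 | 0.370 | 1.000 | 0.639 | 0.453 | 0.438 | 0.351 | 0.350 | 0.300 | 0.062 | 0.028 | 0.028
VOMITO | 0.014 | 0.085 | 0.003 | 0.081 | 0.011 | 0.386 | 0.376 | 0.353 | 0.343 | 0.402 | 0.365 | 0.639 | 1.000 | 0.472 | 0.441 | 0.350 | 0.350 | 0.309 | 0.058 | 0.051 | 0.031
DOR_ABD | 0.019 | 0.040 | 0.000 | 0.081 | 0.016 | 0.275 | 0.274 | 0.518 | 0.339 | 0.302 | 0.277 | 0.453 | 0.472 | 1.000 | 0.504 | 0.524 | 0.519 | 0.256 | 0.066 | 0.049 | 0.057
FADIGA | 0.026 | 0.115 | 0.006 | 0.128 | 0.019 | 0.287 | 0.291 | 0.423 | 0.310 | 0.350 | 0.316 | 0.438 | 0.441 | 0.504 | 1.000 | 0.479 | 0.475 | 0.277 | 0.060 | 0.085 | 0.067
PERD_OLFT | 0.028 | 0.065 | 0.000 | 0.086 | 0.009 | 0.229 | 0.231 | 0.506 | 0.280 | 0.256 | 0.236 | 0.351 | 0.350 | 0.524 | 0.479 | 1.000 | 0.731 | 0.257 | 0.059 | 0.061 | 0.063
PERD_PALA | 0.028 | 0.060 | 0.000 | 0.090 | 0.011 | 0.230 | 0.231 | 0.501 | 0.278 | 0.257 | 0.240 | 0.350 | 0.350 | 0.519 | 0.475 | 0.731 | 1.000 | 0.261 | 0.063 | 0.056 | 0.066
OUTRO_SIN | 0.023 | 0.039 | 0.000 | 0.164 | 0.039 | 0.204 | 0.213 | 0.232 | 0.176 | 0.206 | 0.208 | 0.300 | 0.309 | 0.256 | 0.277 | 0.257 | 0.261 | 1.000 | 0.112 | 0.044 | 0.053
VACINA | 0.067 | 0.093 | 0.013 | 0.187 | 0.073 | 0.043 | 0.048 | 0.077 | 0.039 | 0.040 | 0.070 | 0.062 | 0.058 | 0.066 | 0.060 | 0.059 | 0.063 | 0.112 | 1.000 | 0.138 | 0.061
VACINA_COV | 0.070 | 0.487 | 0.021 | 0.135 | 0.051 | 0.120 | 0.114 | 0.077 | 0.010 | 0.081 | 0.035 | 0.028 | 0.051 | 0.049 | 0.085 | 0.061 | 0.056 | 0.044 | 0.138 | 1.000 | 0.261
CLASSI_FIN | 0.137 | 0.236 | 0.019 | 0.157 | 0.061 | 0.087 | 0.104 | 0.083 | 0.065 | 0.094 | 0.053 | 0.028 | 0.031 | 0.057 | 0.067 | 0.063 | 0.066 | 0.053 | 0.061 | 0.261 | 1.000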
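
The pairs flagged as "High correlation" in the Alerts list all have coefficients of at least 0.5 in this matrix. A minimal sketch of how such pairs could be pulled out of a correlation matrix follows; the 0.5 cutoff is an assumption inferred from the values in this report, the helper `high_pairs` is hypothetical, and only a four-variable excerpt of the matrix is used.

```python
import pandas as pd

# Four-variable excerpt copied from the matrix in this section.
cols = ["DIARREIA", "VOMITO", "PERD_OLFT", "PERD_PALA"]
corr = pd.DataFrame(
    [
        [1.000, 0.639, 0.351, 0.350],
        [0.639, 1.000, 0.350, 0.350],
        [0.351, 0.350, 1.000, 0.731],
        [0.350, 0.350, 0.731, 1.000],
    ],
    index=cols,
    columns=cols,
)


def high_pairs(matrix, threshold=0.5):
    """Return (a, b, value) for each unordered pair with |value| >= threshold."""
    names = list(matrix.columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:  # upper triangle only, skips the diagonal
            value = float(matrix.loc[a, b])
            if abs(value) >= threshold:
                pairs.append((a, b, value))
    return pairs


pairs = high_pairs(corr)
print(pairs)
```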

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
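
One common way to obtain a nullity correlation is to correlate the boolean "is missing" indicators of each column. The sketch below does this with pandas on a small made-up frame; the column names and values are invented for illustration, and whether the heatmap in this report was computed exactly this way is an assumption.

```python
import numpy as np
import pandas as pd

# Toy data: A and B are missing on the same rows; C is missing on other rows.
df = pd.DataFrame({
    "A": [1.0, np.nan, 2.0, np.nan, 1.0, 2.0],
    "B": [2.0, np.nan, 2.0, np.nan, 1.0, 1.0],
    "C": [np.nan, 1.0, np.nan, 1.0, 2.0, np.nan],
})

# 1 where a value is missing, 0 where it is present.
nullity = df.isnull().astype(int)

# Pearson correlation between the missingness indicators.
nullity_corr = nullity.corr()
print(nullity_corr.round(3))
```

A value near 1 means two columns tend to be missing together, and a negative value means one tends to be present when the other is missing.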

Sample

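(index) | DT_SIN_PRI | SEM_PRI | ESTRANG | NU_IDADE_N | SG_UF | CS_ZONA | OUT_ANIM | FEBRE | TOSSE | GARGANTA | DISPNEIA | DESC_RESP | SATURACAO | DIARREIA | VOMITO | DOR_ABD | FADIGA | PERD_OLFT | PERD_PALA | OUTRO_SIN | OUTRO_DES | VACINA | VACINA_COV | DOSE_2REF | CLASSI_FIN | CLASSI_OUT
0 | 17/01/2023 | 3 | 2.0 | 75 | MG | 1.0 | NaN | 1.0 | 1.0 | 2.0 | 1.0 | 1.0 | 1.0 | 2.0 | 1.0 | 2.0 | 1.0 | 2.0 | 2.0 | 2.0 | NaN | 1.0 | 1.0 | 11/04/2022 | 4.0 | NaN
1 | 01/01/2023 | 1 | 2.0 | 67 | RJ | 1.0 | NaN | 1.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | NaN | 9.0 | 1.0 | 10/05/2022 | 4.0 | NaN
2 | 05/01/2023 | 1 | NaN | 72 | SP | 1.0 | NaN | 2.0 | 1.0 | 2.0 | 1.0 | 2.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | NaN | 1.0 | 19/04/2022 | 5.0 | NaN
3 | 18/01/2023 | 3 | 2.0 | 46 | SP | 1.0 | NaN | 1.0 | 2.0 | NaN | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.0 | FRAQUEZA,MAL ESTAR,MIALGIA | 2.0 | 1.0 | NaN | 4.0 | NaN
4 | 03/02/2023 | 5 | 2.0 | 71 | SP | NaN | NaN | NaN | 1.0 | 2.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | NaN | 1.0 | 1.0 | 25/05/2022 | 4.0 | NaN
5 | 02/02/2023 | 5 | 2.0 | 7 | RJ | 1.0 | NaN | 2.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.0 | CORIZA | 2.0 | 1.0 | NaN | 4.0 | NaN
6 | 25/01/2023 | 4 | 2.0 | 1 | BA | 1.0 | NaN | 1.0 | 2.0 | 2.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | 9.0 | 2.0 | NaN | 4.0 | NaN
7 | 22/02/2023 | 8 | 2.0 | 4 | PR | 9.0 | NaN | NaN | 1.0 | NaN | 1.0 | NaN | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | CORIZA | NaN | 2.0 | NaN | 2.0 | NaN
8 | 09/02/2023 | 6 | 2.0 | 8 | AL | NaN | NaN | 1.0 | 1.0 | 2.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.0 | CORIZA | NaN | 1.0 | NaN | 4.0 | NaN
9 | 03/01/2023 | 1 | 2.0 | 86 | PR | 2.0 | NaN | 1.0 | 2.0 | NaN | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 1.0 | 2.0 | 2.0 | NaN | NaN | NaN | 1.0 | 18/04/2022 | 4.0 | NaN

(index) | DT_SIN_PRI | SEM_PRI | ESTRANG | NU_IDADE_N | SG_UF | CS_ZONA | OUT_ANIM | FEBRE | TOSSE | GARGANTA | DISPNEIA | DESC_RESP | SATURACAO | DIARREIA | VOMITO | DOR_ABD | FADIGA | PERD_OLFT | PERD_PALA | OUTRO_SIN | OUTRO_DES | VACINA | VACINA_COV | DOSE_2REF | CLASSI_FIN | CLASSI_OUT
226753 | 11/08/2023 | 32 | 2.0 | 79 | CE | 1.0 | NaN | 2.0 | 1.0 | 2.0 | 2.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | NaN | 2.0 | NaN | 4.0 | NaN
226754 | 28/09/2023 | 39 | 2.0 | 40 | SP | 1.0 | NaN | 2.0 | 1.0 | 2.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | NaN | 2.0 | NaN | 5.0 | NaN
226755 | 10/09/2023 | 37 | 2.0 | 7 | DF | 3.0 | NaN | 1.0 | 1.0 | NaN | NaN | NaN | 1.0 | NaN | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 | NaN | 5.0 | NaN
226756 | 16/09/2023 | 37 | 2.0 | 2 | CE | 1.0 | NaN | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | NaN | NaN | 1.0 | NaN | NaN | NaN
226757 | 15/09/2023 | 37 | 2.0 | 3 | MS | 1.0 | NaN | 1.0 | 1.0 | 2.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 1.0 | 2.0 | 2.0 | 1.0 | CONG. NASAL.RINORREIA.PROSTACA | 9.0 | 2.0 | NaN | 4.0 | NaN
226758 | 17/09/2023 | 38 | 2.0 | 3 | SP | 1.0 | NaN | 1.0 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 | NaN | 5.0 | NaN
226759 | 10/07/2023 | 28 | 2.0 | 5 | PR | NaN | NaN | 2.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.0 | ESFORCO RESP, ROUQUIDAO | 9.0 | 2.0 | NaN | 4.0 | NaN
226760 | 30/07/2023 | 31 | 2.0 | 2 | RS | 1.0 | NaN | 2.0 | 1.0 | 2.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.0 | DIFICULDADE PARA MAMAR | 9.0 | 2.0 | NaN | 2.0 | NaN
226761 | 11/09/2023 | 37 | NaN | 11 | DF | 1.0 | NaN | 1.0 | 1.0 | NaN | 1.0 | NaN | 1.0 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 | NaN | 4.0 | NaN
226762 | 04/10/2023 | 40 | 2.0 | 20 | SP | 1.0 | NaN | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 1.0 | PARTO CESAREA ASSINTOMATICA | 2.0 | 1.0 | NaN | 5.0 | NaN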

Duplicate rows

Most frequently occurring

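(index) | DT_SIN_PRI | SEM_PRI | ESTRANG | NU_IDADE_N | SG_UF | CS_ZONA | OUT_ANIM | FEBRE | TOSSE | GARGANTA | DISPNEIA | DESC_RESP | SATURACAO | DIARREIA | VOMITO | DOR_ABD | FADIGA | PERD_OLFT | PERD_PALA | OUTRO_SIN | OUTRO_DES | VACINA | VACINA_COV | DOSE_2REF | CLASSI_FIN | CLASSI_OUT | # duplicates
21 | 01/05/2023 | 18 | 2.0 | 26 | BA | 1.0 | NaN | 2.0 | 2.0 | 2.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | NaN | 2.0 | NaN | 2.0 | NaN | 4
26 | 01/06/2023 | 22 | 2.0 | 2 | SP | 3.0 | NaN | 2.0 | 1.0 | NaN | 1.0 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 | NaN | 4.0 | NaN | 4
37 | 01/09/2023 | 35 | NaN | 57 | DF | 1.0 | NaN | 1.0 | 1.0 | 2.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | 9.0 | 1.0 | 03/06/2022 | 1.0 | NaN | 4
40 | 01/10/2023 | 40 | 2.0 | 64 | SP | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 9.0 | 1.0 | 07/06/2022 | 5.0 | NaN | 4
41 | 02/01/2023 | 1 | 2.0 | 34 | PR | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | NaN | 4.0 | NaN | 4
44 | 02/02/2023 | 5 | NaN | 85 | PE | 1.0 | NaN | NaN | 1.0 | NaN | 1.0 | 1.0 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | NaN | 5.0 | NaN | 4
60 | 02/07/2023 | 27 | 2.0 | 5 | PA | 2.0 | NaN | 1.0 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 | 2.0 | NaN | 4.0 | NaN | 4
61 | 02/07/2023 | 27 | 2.0 | 43 | TO | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | NaN | 5.0 | NaN | 4
62 | 02/07/2023 | 27 | 2.0 | 90 | TO | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 | NaN | 5.0 | NaN | 4
76 | 03/05/2023 | 18 | 2.0 | 51 | SC | 1.0 | NaN | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | NaN | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | NaN | NaN | NaN | 1.0 | NaN | 5.0 | NaN | 4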
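
A table like the one above can be approximated with pandas by grouping fully identical rows and counting them. The sketch below uses a small made-up frame with three of the report's column names; the data values are invented, and treating "# duplicates" as the size of each group of identical rows is an assumption about how the report defines it.

```python
import numpy as np
import pandas as pd

# Toy data: one row appears three times, another twice, one is unique.
df = pd.DataFrame({
    "SG_UF": ["SP", "SP", "PR", "SP", "PR", "BA"],
    "FEBRE": [1.0, 1.0, np.nan, 1.0, np.nan, 2.0],
    "CLASSI_FIN": [4.0, 4.0, 5.0, 4.0, 5.0, 2.0],
})

# Mark every row that has at least one identical twin (NaN matches NaN here).
dup_mask = df.duplicated(keep=False)

# Count how often each duplicated row occurs; dropna=False keeps NaN keys.
dup_counts = (
    df[dup_mask]
    .groupby(list(df.columns), dropna=False)
    .size()
    .reset_index(name="# duplicates")
    .sort_values("# duplicates", ascending=False)
)
print(dup_counts)
```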